Overview
Dataset statistics
| Number of variables | 6 |
|---|---|
| Number of observations | 91216745 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 28.1 GiB |
| Average record size in memory | 331.3 B |
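The average record size follows from the total memory divided by the number of observations. A minimal sketch of that check, assuming the report uses binary units (1 GiB = 2^30 bytes) and that the 28.1 GiB total is rounded to one decimal place; the function and constant names are illustrative only:

```python
# Figures copied from the "Dataset statistics" table.
N_ROWS = 91_216_745
TOTAL_GIB = 28.1            # rounded to one decimal in the report
REPORTED_AVG_BYTES = 331.3

GIB = 2**30  # bytes per GiB (assumption: binary units)


def avg_record_bytes(total_gib: float, n_rows: int) -> float:
    """Average in-memory size of one row, in bytes."""
    return total_gib * GIB / n_rows


# The total is only known to 0.1 GiB, so bracket it by half a unit each way.
low = avg_record_bytes(TOTAL_GIB - 0.05, N_ROWS)
high = avg_record_bytes(TOTAL_GIB + 0.05, N_ROWS)
print(f"{low:.1f} B <= average record size <= {high:.1f} B")
```

The reported 331.3 B falls inside this bracket, so the two figures are consistent with each other.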
Variable types
| Text | 4 |
|---|---|
| Numeric | 1 |
| Categorical | 1 |
Reproduction
| Analysis started | 2025-03-04 04:34:12.088591 |
|---|---|
| Analysis finished | 2025-03-04 04:51:29.848545 |
| Duration | 17 minutes and 17.76 seconds |
| Software version | ydata-profiling v4.12.2 |
| Download configuration | config.json |
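The duration can be re-derived from the two timestamps. A short sketch using only the standard library; the format string is an assumption about how the timestamps were printed:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed timestamp layout
started = datetime.strptime("2025-03-04 04:34:12.088591", FMT)
finished = datetime.strptime("2025-03-04 04:51:29.848545", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```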
Variables
tconst
Text
| Distinct | 10407908 |
|---|---|
| Distinct (%) | 11.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.6 GiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 9.4413666 |
| Min length | 9 |
Unique
| Unique | 993397 |
|---|---|
| Unique (%) | 1.1% |
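"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. A minimal sketch of that distinction on the five sample rows listed for this variable; the helper name is hypothetical and this is not the profiler's own code:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique): different values present, and values seen exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# The five rows shown in the Sample table for tconst.
sample = ["tt0000001", "tt0000001", "tt0000001", "tt0000001", "tt0000002"]
distinct, unique = distinct_and_unique(sample)

# Percentages in the report are relative to the number of observations.
N_ROWS = 91_216_745
distinct_pct = 10_407_908 / N_ROWS * 100
unique_pct = 993_397 / N_ROWS * 100
```

On the sample, `tt0000001` is distinct but not unique because it appears four times. The same ratios applied to the full counts reproduce the 11.4% and 1.1% shown in the tables.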
Sample
| 1st row | tt0000001 |
|---|---|
| 2nd row | tt0000001 |
| 3rd row | tt0000001 |
| 4th row | tt0000001 |
| 5th row | tt0000002 |
| Value | Count | Frequency (%) |
| tt0398022 | 75 | < 0.1% |
| tt5659710 | 69 | < 0.1% |
| tt1438495 | 66 | < 0.1% |
| tt0298590 | 65 | < 0.1% |
| tt0406599 | 64 | < 0.1% |
| tt0365033 | 62 | < 0.1% |
| tt10093312 | 59 | < 0.1% |
| tt2074491 | 59 | < 0.1% |
| tt10093280 | 59 | < 0.1% |
| tt1245530 | 59 | < 0.1% |
| Other values (10407898) | 91216108 | > 99.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 182433490 | 21.2% |
| 1 | 87087770 | 10.1% |
| 2 | 82145060 | 9.5% |
| 0 | 78314659 | 9.1% |
| 4 | 68164247 | 7.9% |
| 8 | 66195867 | 7.7% |
| 6 | 65953532 | 7.7% |
| 3 | 64886385 | 7.5% |
| 5 | 56739838 | 6.6% |
| 7 | 55505727 | 6.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 861210726 | 100.0% |
ordering
Real number (ℝ)
| Distinct | 75 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.0149965 |
| Minimum | 1 |
| Maximum | 75 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 695.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 6 |
| Q3 | 10 |
| 95-th percentile | 17 |
| Maximum | 75 |
| Range | 74 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 5.1561668 |
|---|---|
| Coefficient of variation (CV) | 0.73502058 |
| Kurtosis | 0.93296367 |
| Mean | 7.0149965 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 1.028796 |
| Sum | 6.3988515 × 10^8 |
| Variance | 26.586056 |
| Monotonicity | Not monotonic |
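Several of these statistics are derived from one another, which gives a quick consistency check. A sketch that recomputes them from the reported mean, standard deviation, quartiles and extremes; the variable names are illustrative:

```python
# Values copied from the tables for `ordering`.
n = 91_216_745
mean = 7.0149965
std = 5.1561668
q1, q3 = 3, 10
minimum, maximum = 1, 75

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum of all values

print(f"CV={cv:.8f} variance={variance:.6f} IQR={iqr} range={value_range} sum={total:.7e}")
```

The recomputed values agree with the table up to rounding of the inputs.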
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
| 1 | 10407908 | 11.4% |
| 2 | 9414511 | 10.3% |
| 3 | 8540051 | 9.4% |
| 4 | 7856183 | 8.6% |
| 5 | 7129889 | 7.8% |
| 6 | 6490969 | 7.1% |
| 7 | 5913737 | 6.5% |
| 8 | 5395459 | 5.9% |
| 9 | 4881306 | 5.4% |
| 10 | 4401038 | 4.8% |
| Other values (65) | 20785694 | 22.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 10407908 | 11.4% |
| 2 | 9414511 | 10.3% |
| 3 | 8540051 | 9.4% |
| 4 | 7856183 | 8.6% |
| 5 | 7129889 | 7.8% |
| 6 | 6490969 | 7.1% |
| 7 | 5913737 | 6.5% |
| 8 | 5395459 | 5.9% |
| 9 | 4881306 | 5.4% |
| 10 | 4401038 | 4.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 75 | 1 | < 0.1% |
| 74 | 1 | < 0.1% |
| 73 | 1 | < 0.1% |
| 72 | 1 | < 0.1% |
| 71 | 1 | < 0.1% |
| 70 | 1 | < 0.1% |
| 69 | 2 | < 0.1% |
| 68 | 2 | < 0.1% |
| 67 | 2 | < 0.1% |
| 66 | 3 | < 0.1% |
nconst
Text
| Distinct | 6604924 |
|---|---|
| Distinct (%) | 7.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.6 GiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 9.1491478 |
| Min length | 9 |
Unique
| Unique | 3495458 |
|---|---|
| Unique (%) | 3.8% |
Sample
| 1st row | nm1588970 |
|---|---|
| 2nd row | nm0005690 |
| 3rd row | nm0005690 |
| 4th row | nm0374658 |
| 5th row | nm0721526 |
| Value | Count | Frequency (%) |
| nm0438471 | 37824 | < 0.1% |
| nm0438506 | 31543 | < 0.1% |
| nm7370686 | 28893 | < 0.1% |
| nm8467983 | 28389 | < 0.1% |
| nm6352729 | 26027 | < 0.1% |
| nm0914844 | 25565 | < 0.1% |
| nm0251041 | 25370 | < 0.1% |
| nm1203430 | 22776 | < 0.1% |
| nm2273814 | 21679 | < 0.1% |
| nm5042664 | 20490 | < 0.1% |
| Other values (6604914) | 90948189 | 99.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 92125022 | 11.0% |
| n | 91216745 | 10.9% |
| m | 91216745 | 10.9% |
| 1 | 85858004 | 10.3% |
| 2 | 65076703 | 7.8% |
| 3 | 62384922 | 7.5% |
| 4 | 61120605 | 7.3% |
| 5 | 59483310 | 7.1% |
| 6 | 58104019 | 7.0% |
| 7 | 56425759 | 6.8% |
| Other values (2) | 111543651 | 13.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 834555485 | 100.0% |
category
Categorical
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.4 GiB |
Length
| Max length | 19 |
|---|---|
| Median length | 15 |
| Mean length | 6.7319859 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | self |
|---|---|
| 2nd row | director |
| 3rd row | producer |
| 4th row | cinematographer |
| 5th row | director |
Common Values
| Value | Count | Frequency (%) |
| actor | 21807042 | 23.9% |
| actress | 16375545 | 18.0% |
| self | 13169669 | 14.4% |
| writer | 10930025 | 12.0% |
| director | 7864950 | 8.6% |
| producer | 6876973 | 7.5% |
| editor | 4817218 | 5.3% |
| cinematographer | 3673212 | 4.0% |
| composer | 2964324 | 3.2% |
| production_designer | 1094279 | 1.2% |
| Other values (3) | 1643508 | 1.8% |
Length
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
| r | 109557268 | 17.8% |
| e | 74740376 | 12.2% |
| t | 69266959 | 11.3% |
| c | 63370586 | 10.3% |
| o | 55363291 | 9.0% |
| s | 51059688 | 8.3% |
| a | 47735701 | 7.8% |
| i | 32188224 | 5.2% |
| d | 22828025 | 3.7% |
| p | 14608788 | 2.4% |
| Other values (10) | 73350937 | 11.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 614069843 | 100.0% |
job
Text
| Distinct | 44233 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.1 GiB |
Length
| Max length | 290 |
|---|---|
| Median length | 2 |
| Mean length | 3.429542 |
| Min length | 1 |
Unique
| Unique | 30519 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | \N |
|---|---|
| 2nd row | \N |
| 3rd row | producer |
| 4th row | director of photography |
| 5th row | \N |
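Three of the five sample rows hold the token `\N`, yet the variable is reported with 0 missing values, so the profiler evidently treated `\N` as an ordinary string. A minimal sketch of mapping that token to a real missing value before profiling. It assumes `\N` is the source file's null marker (as in tab-separated dumps that use the same `tt…`/`nm…` identifiers); the helper name is hypothetical:

```python
NULL_MARKER = r"\N"  # assumption: this token means "no value" in the source file


def to_optional(cell: str):
    """Map the null marker to None and leave every other cell unchanged."""
    return None if cell == NULL_MARKER else cell


# The five rows shown in the Sample table for `job`.
job_sample = [r"\N", r"\N", "producer", "director of photography", r"\N"]
cleaned = [to_optional(v) for v in job_sample]
missing = sum(v is None for v in cleaned)

# Rough share of null rows, assuming every backslash counted in the
# character table for `job` belongs to a \N marker.
backslashes_in_job = 74_170_629
n_rows = 91_216_745
share_null = backslashes_in_job / n_rows
print(f"{missing} of {len(cleaned)} sample rows missing; about {share_null:.1%} overall")
```

Under that assumption, most `job` cells would count as missing, which changes how the length and word statistics for this variable should be read.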
| Value | Count | Frequency (%) |
| n | 74170663 | 76.8% |
| producer | 6878538 | 7.1% |
| writer | 2182400 | 2.3% |
| director | 1692410 | 1.8% |
| by | 1633071 | 1.7% |
| editor | 875181 | 0.9% |
| written | 738268 | 0.8% |
| composer | 593916 | 0.6% |
| created | 574716 | 0.6% |
| creator | 570499 | 0.6% |
| Other values (32826) | 6675978 | 6.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 74177859 | 23.7% |
| \ | 74170629 | 23.7% |
| r | 29767312 | 9.5% |
| e | 20412315 | 6.5% |
| o | 16068325 | 5.1% |
| c | 12587115 | 4.0% |
| d | 11949743 | 3.8% |
| t | 11163896 | 3.6% |
| p | 10241585 | 3.3% |
| i | 8996459 | 2.9% |
| Other values (142) | 43296423 | 13.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 312831661 | 100.0% |
characters
Text
| Distinct | 4237735 |
|---|---|
| Distinct (%) | 4.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 GiB |
Length
| Max length | 463 |
|---|---|
| Median length | 2 |
| Mean length | 8.4807145 |
| Min length | 2 |
Unique
| Unique | 2966501 |
|---|---|
| Unique (%) | 3.3% |
Sample
| 1st row | ["Self"] |
|---|---|
| 2nd row | \N |
| 3rd row | \N |
| 4th row | \N |
| 5th row | \N |
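The one non-null sample, `["Self"]`, looks like a JSON array stored as text, which would also explain why quotes and square brackets rank among the most frequent characters. A sketch of decoding such cells, assuming every non-null cell is a valid JSON array of strings and that `\N` marks a missing value; `parse_characters` is a hypothetical helper:

```python
import json
from typing import List


def parse_characters(cell: str) -> List[str]:
    """Decode one `characters` cell into a list of character names."""
    if cell == r"\N":  # assumed null marker
        return []
    return json.loads(cell)


# First two rows shown in the Sample table for `characters`.
rows = ['["Self"]', r"\N"]
parsed = [parse_characters(r) for r in rows]
print(parsed)
```

Profiling the decoded names rather than the raw strings would keep the JSON punctuation out of the word and character counts.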
| Value | Count | Frequency (%) |
| n | 47027740 | 35.0% |
| self | 13179585 | 9.8% |
| | 7795731 | 5.8% |
| host | 2276811 | 1.7% |
| guest | 577027 | 0.4% |
| the | 467635 | 0.3% |
| presenter | 446236 | 0.3% |
| dr | 432712 | 0.3% |
| contestant | 393532 | 0.3% |
| de | 334408 | 0.2% |
| Other values (1064934) | 61387383 | 45.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| " | 88516459 | 11.4% |
| e | 50780469 | 6.6% |
| N | 49081193 | 6.3% |
| \ | 47157395 | 6.1% |
| [ | 44207085 | 5.7% |
| ] | 44206941 | 5.7% |
| (whitespace) | 43102142 | 5.6% |
| a | 40421177 | 5.2% |
| r | 29161906 | 3.8% |
| l | 29058448 | 3.8% |
| Other values (191) | 307889956 | 39.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 773583171 | 100.0% |
Interactions
Correlations
| | category | ordering |
|---|---|---|
| category | 1.000 | 0.226 |
| ordering | 0.226 | 1.000 |
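The report does not say which association measure produced the 0.226. One common choice for a categorical variable paired with a binned numeric one is Cramér's V, and the sketch below assumes that measure purely for illustration. It implements the uncorrected form on toy data (the labels are hypothetical, not taken from the dataset):

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    pair_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_counts), len(y_counts)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data: category against a binned ordering ("low"/"high" are made-up bins).
perfect = cramers_v(["actor", "actor", "director", "director"],
                    ["low", "low", "high", "high"])
independent = cramers_v(["actor", "actor", "director", "director"],
                        ["low", "high", "low", "high"])
print(perfect, independent)
```

The measure runs from 0 (no association) to 1 (one variable determines the other), so a value near 0.226 would indicate a weak association between `category` and `ordering`.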
Missing values
A simple visualization of nullity by column.